Prediction and Clustering of Bank Customer Churn Based on XGBoost and K-means
Abstract
Owing to fierce competition among commercial banks, customers are becoming more and more important to banks. Customer churn has therefore become a major problem that banks need to face. In this paper, the XGBoost algorithm is applied to a US bank data set from Kaggle to predict customer churn, and grid search is used to find the best hyperparameters. K-means is then adopted to further subdivide the lost customers. For churn prediction, XGBoost achieves 0.84 in accuracy and 0.83 in precision on the test set; recall and F1 score are also reported. The most important features in this case are the customers' estimated salary and credit balance. For the segmentation of lost customers, K-means divides them into 5 groups. These five groups have different values, so the paper puts forward recovery suggestions corresponding to their respective characteristics.
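The abstract names two building blocks that are easy to illustrate: the four evaluation scores (accuracy, precision, recall, F1) and K-means segmentation into five groups. The sketch below is not the authors' implementation. It is a minimal, standard-library-only illustration that assumes binary labels where 1 means "churned" and customers represented as numeric tuples. The function names `classification_metrics` and `kmeans`, the toy labels, and the synthetic points are all hypothetical. The XGBoost model and the grid search themselves are not reproduced here.

```python
import random


def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for binary labels, 1 = churned."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Toy usage with made-up data (not the Kaggle data set).
scores = classification_metrics([1, 1, 0, 0], [1, 0, 0, 0])

data_rng = random.Random(42)
toy_points = [
    (cx + data_rng.uniform(-0.5, 0.5), cy + data_rng.uniform(-0.5, 0.5))
    for cx, cy in [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5)]
    for _ in range(20)
]
toy_labels, toy_centroids = kmeans(toy_points, k=5)
print(scores)
print(toy_centroids)
```

In practice, features on very different scales (salary versus a score, for example) would normally be standardised before clustering, since K-means relies on Euclidean distance.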
Similar resources
Hierarchical Alpha-cut Fuzzy C-means, Fuzzy ARTMAP and Cox Regression Model for Customer Churn Prediction
As customers are the main asset of any organization, customer churn management is becoming a major task for organizations to retain their valuable customers. In the previous studies, the applicability and efficiency of hierarchical data mining techniques for churn prediction by combining two or more techniques have been proved to provide better performances than many single techniques over a nu...
Persistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm
Identifying clusters, or clustering, is an important aspect of data analysis. It is the task of grouping a set of objects in such a way that objects in the same group/cluster are more similar in some sense or another. It is a main task of exploratory data mining, and a common technique for statistical data analysis. This paper proposed an improved version of K-Means algorithm, namely Persistent K...
Hybrid Models Using Unsupervised Clustering for Prediction of Customer Churn
In this paper, we use two-stage hybrid models consisting of unsupervised clustering techniques and decision trees with boosting on two different data sets and evaluate the models in terms of top decile lift. We examine two different approaches for hybridization of the models for utilizing the results of clustering based on various attributes related to service usage and revenue contribution of ...
Ranking and Clustering Iranian Provinces Based on COVID-19 Spread: K-Means Cluster Analysis
Introduction: The Coronavirus has crossed geographical borders. This study was performed to rank and cluster Iranian provinces based on coronavirus disease (COVID-19) recorded cases from February 19 to March 22, 2020. Materials and Methods: This cross-sectional study was conducted in 31 provinces of Iran using the daily number of confirmed cases. Cumulative Frequency (CF) and Adjusted CF (ACF)...
Journal
Journal title: BCP business & management
Year: 2022
ISSN: 2692-6156
DOI: https://doi.org/10.54691/bcpbm.v23i.1373